
Author(s): 

ARASU A.

Issue Info: 
  • Year: 

    2004
  • Volume: 

    -
  • Issue: 

    -
  • Pages: 

    265-265
Measures: 
  • Citations: 

    1
  • Views: 

    133
  • Downloads: 

    0
Keywords: 
Abstract: 

Yearly Impact: مرکز اطلاعات علمی Scientific Information Database (SID) - Trusted Source for Research and Academic Resources

Issue Info: 
  • Year: 

    2023
  • Volume: 

    9
Measures: 
  • Views: 

    66
  • Downloads: 

    10
Abstract: 

Data mining is studied extensively around the world, but many methods are limited to analyzing small databases. Technological advances have led to the emergence of incremental machine learning and stream data classification to handle large amounts of diverse data. The challenge is to extract information quickly from incoming sequences of data, but the high speed and complexity of the input limit the applicability of previously proposed methods. The Hoeffding tree algorithm is central to stream data classification and uses the Hoeffding bound to select a splitting feature. In this paper, we propose a method that combines an incremental decision tree, the Hoeffding tree, with ensemble learning using bagging to improve accuracy. Our implementation and analysis show that the proposed method is more accurate than a single Hoeffding tree. We also analyze the algorithm with different numbers of base models and present plots that illustrate the improvement in accuracy.
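
The split rule mentioned in this abstract can be illustrated with a minimal Python sketch. It uses the standard Hoeffding bound, epsilon = sqrt(R^2 * ln(1/delta) / (2n)). The function names, the example values, and the majority-vote helper standing in for the bagged ensemble are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with range `value_range` lies within epsilon of the mean of n samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split a leaf once the best feature beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Combine the base models' predictions (assumed here: simple majority vote)."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical numbers: 1000 examples seen at a leaf, gains in [0, 1].
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
    print(majority_vote(["spam", "ham", "spam"]))
```

As n grows, epsilon shrinks, so a leaf that has seen more examples can commit to a split on a smaller difference between the two best features.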

Author(s): 

ABADI D.

Issue Info: 
  • Year: 

    2004
  • Volume: 

    -
  • Issue: 

    -
  • Pages: 

    666-666
Measures: 
  • Citations: 

    1
  • Views: 

    115
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2023
  • Volume: 

    10
  • Issue: 

    1
  • Pages: 

    35-46
Measures: 
  • Citations: 

    0
  • Views: 

    27
  • Downloads: 

    7
Abstract: 

Outlier detection in data streams is an essential issue in data processing. Today, due to the massive growth of streaming data generated by the spread of the Internet of Things, outlier detection has become a significant challenge. Much progress has been made with local outlier detection algorithms, such as density-based local outlier factor algorithms, which are suitable for static data. Incremental versions of these algorithms are used to detect local outliers in streaming data. However, outlier detection in streaming data faces the challenges of limited memory capacity, high execution time, the inaccessibility of all data at one time, and changes in data distribution (increasing and decreasing input rates, uncertainty, etc.). In this paper, we propose a density-based summarization algorithm that summarizes the data every time the buffer is filled. The proposed algorithm maintains the desired shape of the clusters at a low computational cost. To this end, larger clusters are selected and the data in their dense areas are reduced so that the shape of the old clusters is not lost. The proposed summarization algorithm reduces execution time and increases precision, recall, and F1 score compared with the evaluated algorithms.
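
The idea of thinning dense areas whenever the buffer fills can be sketched as follows. This is only a rough illustration under stated assumptions: a fixed grid stands in for the density estimate, and `cell_size`, `max_per_cell`, and the `SummarizingBuffer` class are hypothetical names, not the algorithm proposed in the paper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def summarize(buffer: List[Point], cell_size: float, max_per_cell: int) -> List[Point]:
    """Thin out dense grid cells while keeping sparse cells untouched."""
    cells: Dict[Tuple[int, int], List[Point]] = defaultdict(list)
    for p in buffer:
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells[key].append(p)

    summary: List[Point] = []
    for pts in cells.values():
        if len(pts) <= max_per_cell:
            # Sparse cell (cluster border or isolated point): keep every point.
            summary.extend(pts)
        else:
            # Dense cell: keep an evenly spaced subset of max_per_cell points.
            step = len(pts) / max_per_cell
            summary.extend(pts[int(i * step)] for i in range(max_per_cell))
    return summary


class SummarizingBuffer:
    """Fixed-capacity buffer that summarizes itself each time it fills up."""

    def __init__(self, capacity: int, cell_size: float, max_per_cell: int) -> None:
        self.capacity = capacity
        self.cell_size = cell_size
        self.max_per_cell = max_per_cell
        self.points: List[Point] = []

    def add(self, p: Point) -> None:
        self.points.append(p)
        if len(self.points) >= self.capacity:
            self.points = summarize(self.points, self.cell_size, self.max_per_cell)
```

Because sparse cells are never thinned, points on cluster borders and isolated points survive summarization, which is one simple way to keep the outline of older clusters while freeing memory.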

Author(s): 

SVEN S. | THOMAS L. | DANIEL S.

Issue Info: 
  • Year: 

    2005
  • Volume: 

    17
  • Issue: 

    -
  • Pages: 

    167-176
Measures: 
  • Citations: 

    1
  • Views: 

    107
  • Downloads: 

    0
Keywords: 
Abstract: 

Issue Info: 
  • Year: 

    2016
  • Volume: 

    10
  • Issue: 

    4
  • Pages: 

    479-488
Measures: 
  • Citations: 

    0
  • Views: 

    932
  • Downloads: 

    0
Abstract: 

Accurate prediction of daily river discharge is a useful tool for water resources planning and management. Models that produce explicit equations, such as M5 model trees and genetic expression programming, are more efficient to use. In this study, the Galikesh basin, one of the most flood-prone basins in Golestan Province, is considered for the prediction of daily river discharge. The data series used are 26 years of daily rainfall and river discharge records from the Galikesh meteorological and hydrometric station. Daily rainfall and river discharge data from 1 to 5 days ahead are used as inputs for prediction by M5 model trees, genetic expression programming, and artificial neural network models. The results indicate very good efficiency for the investigated models, although the models tend to overestimate daily river discharge. Comparing the results of the different models leads to the selection of M5 model trees as the best of the investigated models.

Issue Info: 
  • Year: 

    2011
  • Volume: 

    2
  • Issue: 

    2 (4)
  • Pages: 

    57-70
Measures: 
  • Citations: 

    0
  • Views: 

    1294
  • Downloads: 

    0
Abstract: 

In recent years, cyber warfare has become one of the most important battlefields for intelligence and military organizations. As the volume of data grows, data stream processing and data security become basic requirements of military enterprises. Although encryption prevents anyone from reading the data contents, adversaries can still delete records from a data stream or insert counterfeit records into it without any knowledge of the record contents. Such attacks are called integrity attacks. Because integrity control in a Data Stream Management System (DSMS) is a continuous process, it can be categorized as a passive defense activity. In this paper, we present a probabilistic auditing model for integrity control of stream results received via an unsecured environment. In our architecture, users and data stream owners are trusted and channels are unsecured. The server is treated as a black box, and auditing is carried out cooperatively by the data stream owner and the users. We use existing DSMSs without any modification. Our method imposes no significant cost on the user side and detects integrity attacks accurately and rapidly. The correctness and convergence of our probabilistic algorithm are proved, and our evaluation shows very good results.

Issue Info: 
  • Year: 

    2015
  • Volume: 

    17
  • Issue: 

    1 (64)
  • Pages: 

    77-90
Measures: 
  • Citations: 

    1
  • Views: 

    963
  • Downloads: 

    0
Abstract: 

Introduction: Rainfall-runoff modeling is one of the keystones of scientific hydrology and environmental management. Researchers therefore continuously look for new approaches to improve existing models or modeling methodologies. Material and Methods: In this paper, daily stream flow at the outlet of a watershed in southwestern Iran was simulated using a conceptual continuous rainfall-runoff model. To deal with the problem of poor-quality data, the required data, such as runoff, rainfall, and PET, were prepared using a specific approach. Results and Discussion: The Nash-Sutcliffe efficiency was 0.80 and the coefficient of determination was 0.82 during calibration; during validation, the Nash-Sutcliffe efficiency was 0.83 and the coefficient of determination was 0.83. Furthermore, the statistics of the observed stream flow were preserved in the simulated stream flow. The results show that this approach can be applied successfully to daily rainfall-runoff modeling when the quality of the input data is inadequate.
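
For readers unfamiliar with the two scores reported in this abstract, the following Python sketch computes them for a pair of observed and simulated series. The Nash-Sutcliffe efficiency follows its standard definition; taking the coefficient of determination as the squared Pearson correlation is an assumption (a common convention in hydrology), and the sample values are made up for illustration.

```python
from typing import Sequence


def nash_sutcliffe(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def r_squared(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return (cov * cov) / (var_obs * var_sim)


if __name__ == "__main__":
    # Hypothetical daily flows: the simulation is biased by +1 on every day.
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]
    sim = [2.0, 3.0, 4.0, 5.0, 6.0]
    print(nash_sutcliffe(obs, sim), r_squared(obs, sim))
```

The two scores answer different questions: a constant bias leaves the squared correlation at its maximum but lowers the Nash-Sutcliffe efficiency, which is why studies such as this one report both.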

References: 4
Issue Info: 
  • Year: 

    2022
  • Volume: 

    18
  • Issue: 

    4 (50)
  • Pages: 

    153-164
Measures: 
  • Citations: 

    0
  • Views: 

    253
  • Downloads: 

    0
Abstract: 

A data stream is a sequence of data generated at high speed and in high volume from various information sources. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refers to changes in the statistical properties of the data and is divided into four categories: sudden, gradual, incremental, and recurring. It is generally dealt with by periodically updating the classifier or by employing an explicit change detector to determine the update time. These approaches assume that the true labels are available for all data samples. However, because labeling instances is costly, access to only partial labeling is more realistic. In a number of studies that have used semi-supervised learning, labels are received from the user to update the models in the form of active learning. The purpose of this study is to classify samples in an unlimited data stream in the presence of concept drift, using only a limited set of initial labeled data. To this end, a semi-supervised ensemble learning algorithm for data streams is proposed, which uses entropy variation to detect concept drift and is applicable to sudden and gradual drifts. The proposed model is trained with a limited initial labeled set. When concept drift occurs, the unlabeled data is used to update the ensemble model, so no labels need to be received from the user. In contrast to many current studies, the proposed algorithm uses an ensemble of K-NN classifiers. It constructs a group of clustering-based classification models, each of which is trained on a batch of data. On receiving each new sample, it first determines whether the sample is an outlier. If the sample falls within a cluster, its class is determined by majority voting.
When a window of the stream is received, the possibility of concept drift is examined based on entropy variation, and the classifier is updated by a semi-supervised approach if necessary. The model itself determines the required data labels. The proposed method is capable of detecting concept drift and improving its accuracy by updating the learning model with appropriate samples received from the stream. Therefore, it requires only a small initial labeled dataset. Experiments are performed using five real and synthetic datasets, and the model's performance is compared with three other approaches. The results show that the proposed method is superior to the other approaches in terms of precision, recall, and F1 score.
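
A window-by-window drift check based on entropy variation might look like the sketch below. The abstract does not say which quantity the entropy is computed over or how the threshold is chosen, so this example assumes the Shannon entropy of the labels predicted within each window and a hand-picked threshold; both are illustrative assumptions rather than the paper's method.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the label distribution in one window."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def drift_detected(previous_window: Sequence[Hashable],
                   current_window: Sequence[Hashable],
                   threshold: float) -> bool:
    """Flag drift when the entropy changes by more than `threshold` between windows."""
    change = abs(shannon_entropy(current_window) - shannon_entropy(previous_window))
    return change > threshold


if __name__ == "__main__":
    stable = ["a"] * 8          # one class only: entropy is zero
    mixed = ["a", "b"] * 4      # two balanced classes: entropy is one bit
    print(drift_detected(stable, mixed, threshold=0.5))
```

A check of this kind only reacts to changes in the label mix, so a swap between two equally frequent classes would go unnoticed; a practical detector would need additional signals.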

Issue Info: 
  • Year: 

    2016
  • Volume: 

    4
  • Issue: 

    4
  • Pages: 

    31-42
Measures: 
  • Citations: 

    0
  • Views: 

    1216
  • Downloads: 

    0
Abstract: 

Outlier mining in data streams is an important and active research issue in anomaly detection. Outliers are data points that deviate greatly from the others. They are often not errors and may carry important information. Many recent studies have addressed outlier detection in databases, and many algorithms have been proposed, but most of them are effective only on static data. Because data streams evolve over time, traditional methods do not perform well on them: they can often lead to wrong decisions, and their false positive rate is high. In this paper, an algorithm is proposed that divides the stream evenly into pieces and computes the local outlier factor for every data point. The algorithm keeps a candidate list of possible outliers and detects outliers and unusual patterns by deferring the outlier decision. Experimental results on synthetic and real datasets show that the proposed algorithm reduces the false positive rate and increases accuracy.
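
The combination of chunking, local outlier factor (LOF) scoring, and a candidate list can be sketched in plain Python. The LOF computation follows the standard definition; the chunk handling, the rule that a candidate is confirmed only if it still scores above the threshold alongside the next chunk, and all names and parameter values are assumptions made for illustration, not the published algorithm.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def lof_scores(points: Sequence[Point], k: int) -> List[float]:
    """Standard LOF for each point (assumes len(points) > k and no k-fold duplicates)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbors: List[List[int]] = []
    k_distance: List[float] = []
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        neighbors.append(nearest)
        k_distance.append(dist[i][nearest[-1]])
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]


def detect_outliers(stream: Sequence[Point], chunk_size: int, k: int,
                    threshold: float) -> Tuple[List[Point], List[Point]]:
    """Return (confirmed outliers, candidates still pending at the end of the stream)."""
    candidates: List[Point] = []
    confirmed: List[Point] = []
    for start in range(0, len(stream), chunk_size):
        data = candidates + list(stream[start:start + chunk_size])
        if len(data) <= k:
            break  # too few points to score; remaining candidates stay pending
        scores = lof_scores(data, k)
        n_old = len(candidates)
        # Old candidates that still look anomalous next to new data are confirmed.
        confirmed.extend(p for p, s in zip(data[:n_old], scores[:n_old]) if s > threshold)
        # New suspicious points are only recorded as candidates for now.
        candidates = [p for p, s in zip(data[n_old:], scores[n_old:]) if s > threshold]
    return confirmed, candidates
```

Deferring the decision by one chunk gives a point the chance to be joined by similar points that arrive later, which is one plausible way a candidate list can lower the false positive rate on an evolving stream.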
